An Asynchronous Distributed Proximal Gradient Method for Composite Convex Optimization
Abstract
We propose a distributed first-order augmented Lagrangian (DFAL) algorithm to minimize the sum of composite convex functions, where each term in the sum is a private cost function belonging to a node, and only nodes connected by an edge can directly communicate with each other. This optimization model abstracts a number of applications in distributed sensing and machine learning. We show that any limit point of the DFAL iterates is optimal, and that for any $\epsilon > 0$, an $\epsilon$-optimal and $\epsilon$-feasible solution can be computed within $O(\log(\epsilon^{-1}))$ DFAL iterations, which require $O\left(\frac{\psi_{\max}^{1.5}}{d_{\min}}\,\epsilon^{-1}\right)$ proximal gradient computations and communications per node in total, where $\psi_{\max}$ denotes the largest eigenvalue of the graph Laplacian and $d_{\min}$ is the minimum degree of the graph. We also propose an asynchronous version of DFAL by incorporating randomized block coordinate descent methods, and demonstrate the efficiency of DFAL on large-scale sparse-group LASSO problems.
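As a rough illustration of the "proximal gradient computation" that the abstract counts per node, the sketch below runs a plain, single-node proximal gradient iteration on a toy composite problem: a smooth least-squares term plus a group (Euclidean-norm) penalty of the kind that appears in group LASSO. This is not the DFAL algorithm; there is no network, no augmented Lagrangian, and no asynchrony. The names `prox_group_l2` and `proximal_gradient`, the toy data `b`, and the step size are assumptions made for this example only.

```python
import math

def prox_group_l2(v, tau):
    """Proximal map of tau * ||.||_2, i.e. block soft-thresholding."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm <= tau:
        # The whole block is shrunk to zero.
        return [0.0] * len(v)
    scale = 1.0 - tau / norm
    return [scale * c for c in v]

def proximal_gradient(grad_f, x0, lam, step, iters):
    """Iterate x <- prox_{step*lam*||.||_2}(x - step * grad_f(x))."""
    x = list(x0)
    for _ in range(iters):
        g = grad_f(x)
        forward = [xi - step * gi for xi, gi in zip(x, g)]  # gradient step
        x = prox_group_l2(forward, step * lam)              # proximal step
    return x

# Toy problem (hypothetical data): minimize 0.5*||x - b||^2 + lam*||x||_2.
b = [3.0, 4.0]

def grad_f(x):
    # Gradient of the smooth part 0.5*||x - b||^2.
    return [xi - bi for xi, bi in zip(x, b)]

x_hat = proximal_gradient(grad_f, [0.0, 0.0], lam=1.0, step=0.5, iters=200)
```

For this toy problem the minimizer has a closed form, `b` scaled by `1 - lam/||b||`, which gives a simple check that the iteration converges to the right point.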
Similar resources
Decoupled Asynchronous Proximal Stochastic Gradient Descent with Variance Reduction
In the era of big data, optimizing large scale machine learning problems becomes a challenging task and draws significant attention. Asynchronous optimization algorithms come out as a promising solution. Recently, decoupled asynchronous proximal stochastic gradient descent (DAP-SGD) is proposed to minimize a composite function. It is claimed to be able to offload the computation bottleneck from...
An Asynchronous Distributed Proximal Gradient Method for Composite Convex Optimization
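$x_i = \bar{x}_i$ when $\|\nabla_{x_i} f(\bar{x})\|_2 \le \lambda B_i$, it follows that $\bar{x}_i^* = \bar{x}_i$ if and only if $\|\nabla_{x_i} f(\bar{x})\|_2 \le \lambda B_i$. Hence, $h_i(\bar{x}_i^*) = 0$. Case 2: Suppose that $i \in I^c := N \setminus I$, i.e., $\|\nabla_{x_i} f(\bar{x})\|_2 > \lambda B_i$. In this case, $\bar{x}_i^* \ne \bar{x}_i$. From the first-order optimality condition, we have $\nabla_{x_i} f(\bar{x}) + L_i(\bar{x}_i^* - \bar{x}_i) + \lambda B_i \frac{\bar{x}_i^* - \bar{x}_i}{\|\bar{x}_i^* - \bar{x}_i\|_2} = 0$. Let $s_i := \frac{\bar{x}_i^* - \bar{x}_i}{\|\bar{x}_i^* - \bar{x}_i\|_2}$ and $t_i := \|\bar{x}_i^* - \bar{x}_i\|_2$; then $s_i = \frac{-\nabla_{x_i} f(\bar{x})}{L_i t_i + \lambda B_i}$. Since $\|s_i\|_2 = 1$, i...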
Asynchronous Doubly Stochastic Proximal Optimization with Variance Reduction
In the big data era, both of the sample size and dimension could be huge at the same time. Asynchronous parallel technology was recently proposed to handle the big data. Specifically, asynchronous stochastic (variance reduction) gradient descent algorithms were recently proposed to scale the sample size, and asynchronous stochastic coordinate descent algorithms were proposed to scale the dimens...
Variance-Reduced Proximal Stochastic Gradient Descent for Non-convex Composite optimization
Here we study non-convex composite optimization: first, a finite-sum of smooth but non-convex functions, and second, a general function that admits a simple proximal mapping. Most research on stochastic methods for composite optimization assumes convexity or strong convexity of each function. In this paper, we extend this problem into the non-convex setting using variance reduction techniques, ...
An Accelerated Proximal Coordinate Gradient Method
We develop an accelerated randomized proximal coordinate gradient (APCG) method, for solving a broad class of composite convex optimization problems. In particular, our method achieves faster linear convergence rates for minimizing strongly convex functions than existing randomized proximal coordinate gradient methods. We show how to apply the APCG method to solve the dual of the regularized em...
Publication date: 2015